perm filename GALLAG.LE1[ESS,JMC]1 blob
sn#059034 filedate 1973-08-19 generic text, type C, neo UTF8
Dear Mr. Gallagher:
In accordance with our conversations, this is a request for
a grant of $10,000 from the Associated Press to Stanford University
for an approximately one-year study of improved methods for
maintaining, automatically indexing, and using a computerized news
data base. A major goal of the work will be to see if a completely
automatic storage and indexing system can provide a useful and
cost-effective service.
It seems to me that this can further your interest in having
an automatic library for the A.P. staff and our interest in
experimenting with services for home, university and business use.
Our present ideas for improving our A.P. news data base
include the following:
1. Index on all non-trivial words in the stories. We think
this will not be too expensive in computer time and storage, but we
aren't sure yet.
2. Allow category names that catch all occurrences of words
in the category, e.g. the category "animal" would collect stories
containing the words "dog", "cat", etc.
3. Catch inflected forms of words, e.g. a reference to
"senator" would catch the word "senators".
4. Increase the size of the data base. This is important,
because bigger data bases will require better filtering methods in
order to search them usefully. At Mr. Bowen's suggestion, which
seems quite reasonable to us, we plan to store the A-wire
continuously for perhaps a year. This will require substantial disk
storage; storing a year of A-wire will cost about $750 per month in
rental from IBM. Devising methods of eliminating redundant
information may reduce this requirement.
For other experiments, it may be worthwhile to put some
other wires into the machine. We have in mind the California
regional wire, Dataspeed, and the B-wire.
5. Study the utility of the system. Our subjective
impression and the impression gained from ARPA net use is that the
system is useful in keeping up with a day's news even in its present
form. However, it really requires more formal testing. We suggest
the following:
a. A.P. may wish to make it available internally for
experimental use and for suggestions. You should equip yourselves
with a 30 character per second printer like that made by Texas
Instruments and prepare to suffer a substantial long distance
telephone bill. If competition with other users of our two input
ports proves annoying, a priority port can be provided at some
expense.
b. Stanford faculty in the communications and political
science departments who use current news may be able to help
evaluate the system if we provide a terminal for them.
c. If you approve, we would like to test the system's
utility to a business firm. I have in mind the headquarters of the
Bank of America in San Francisco, because I know someone there. I
haven't discussed it with them, but it would be interesting to see
if they would find it worth (say) $175 per month plus the cost of
renting a suitable terminal.
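As a rough illustration of items 1 through 3 above, the following is a
minimal sketch in Python, not a description of the system as built. The
stop list of "trivial" words, the plural-stripping rule, the category
table beyond the "animal" example, the function names, and the three
sample stories are all assumptions made up for the sketch.

```python
import re
from collections import defaultdict

# Assumed stop list of "trivial" words; the proposal does not specify one.
TRIVIAL_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "on", "for"}

# Hypothetical category table; "animal" -> "dog", "cat" is the proposal's example.
CATEGORIES = {"animal": {"dog", "cat"}}


def normalize(word):
    """Crude inflection handling: strip a trailing plural "s".

    This is an assumed stand-in for real inflection analysis.
    """
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(stories):
    """Map each non-trivial, normalized word to the ids of stories containing it."""
    index = defaultdict(set)
    for story_id, text in stories.items():
        for raw in re.findall(r"[A-Za-z]+", text):
            if raw.lower() in TRIVIAL_WORDS:
                continue  # item 1: index only non-trivial words
            index[normalize(raw)].add(story_id)  # item 3: fold inflected forms
    return index


def search(index, term):
    """Look up a word or a category name (item 2: a category expands to its words)."""
    key = term.lower()
    words = CATEGORIES.get(key, {key})
    hits = set()
    for w in words:
        hits |= index.get(normalize(w), set())
    return hits


# Invented sample stories, for illustration only.
stories = {
    1: "The senators voted on the farm bill.",
    2: "A dog bit a mail carrier in Palo Alto.",
    3: "Cats are popular in the city.",
}
index = build_index(stories)
print(sorted(search(index, "senator")))  # [1]
print(sorted(search(index, "animal")))   # [2, 3]
```

Folding inflected forms at indexing time keeps the index smaller than
storing every surface form, which bears on the storage concern in item 1,
though whether that trade is right for the real data base is one of the
things the study would have to determine.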
It is expected that the work will lead to a PhD thesis
covering this and other matters and that a report of the results
will be published in some computer journal.
Our proposed budget is as follows:
This proposal has the approval of the Administration of
Stanford University.